diff --git a/bin/content.sh b/bin/content.sh new file mode 100755 index 00000000..08012f94 --- /dev/null +++ b/bin/content.sh @@ -0,0 +1,7 @@ +#!/usr/bin/env bash + +. ./default-setup + +DEPOSIT_ID=${1-1} + +curl -i -u "${CREDS}" ${SERVER}/1/${COLLECTION}/${DEPOSIT_ID}/content/ diff --git a/docs/getting-started.md b/docs/getting-started.md index e6210078..8d5e8e73 100644 --- a/docs/getting-started.md +++ b/docs/getting-started.md @@ -1,332 +1,332 @@ # Getting Started This is a getting started to demonstrate the deposit api use case with a shell client. The api is rooted at https://deposit.softwareheritage.org. For more details, see the [main README](./README.md). ## Requirements You need to be referenced on SWH's client list to have: - a credential (needed for the basic authentication step). - an associated collection [Contact us for more information.](https://www.softwareheritage.org/contact/) ## Demonstration For the rest of the document, we will: - reference `` as the client and `` as its associated authentication password. - use curl as example on how to request the api. - present the main deposit use cases. The use cases are: - one single deposit step: The user posts in one query (one deposit) a software source code archive and associated metadata (deposit is finalized with status `ready`). This will demonstrate the multipart query. - another 3-steps deposit (which can be extended as more than 2 steps): 1. Create an incomplete deposit (status `partial`) 2. Update a deposit (and finalize it, so the status becomes `ready`) 3. Check the deposit's state This will demonstrate the stateful nature of the sword protocol. Those use cases share a common part, they must start by requesting the `service document iri` (internationalized resource identifier) for information about the collection's location. 
### Common part - Start with the service document First, to determine the *collection iri* onto which deposit data, the client needs to ask the server where is its *collection* located. That is the role of the *service document iri*. For example: ``` Shell curl -i --user : https://deposit.softwareheritage.org/1/servicedocument/ ``` If everything went well, you should have received a response similar to this: ``` Shell HTTP/1.0 200 OK Server: WSGIServer/0.2 CPython/3.5.3 Content-Type: application/xml 2.0 209715200 The Software Heritage (SWH) Archive Software Collection application/zip Collection Policy Software Heritage Archive Collect, Preserve, Share false http://purl.org/net/sword/package/SimpleZip https://deposit.softwareheritage.org/1// ``` Explaining the response: - `HTTP/1.0 200 OK`: the query is successful and returns a body response - `Content-Type: application/xml`: The body response is in xml format - `body response`: it is a service document describing that the client `` has a collection named ``. That collection is available at the *collection iri* `/1//` (through POST query). At this level, if something went wrong, this should be authentication related. So the response would have been a 401 Unauthorized access. Something like: ``` Shell curl -i https://deposit.softwareheritage.org/1// HTTP/1.0 401 Unauthorized Server: WSGIServer/0.2 CPython/3.5.3 Content-Type: application/xml WWW-Authenticate: Basic realm="" X-Frame-Options: SAMEORIGIN Access to this api needs authentication processing failed ``` ### Single deposit A single deposit translates to a multipart deposit request. This means, in swh's deposit's terms, sending exactly one POST query with: - 1 archive (`content-type application/zip`) - 1 atom xml content (`content-type: application/atom+xml;type=entry`) The supported archive, for now are limited to zip files. Those archives are expected to contain some form of software source code. 
The atom entry content is some xml defining metadata about that software. Example of minimal atom entry file: ``` XML Title urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a 2005-10-07T17:17:08Z Contributor The abstract The abstract Access Rights Alternative Title Date Available Bibliographic Citation Contributor Description Has Part Has Version Identifier Is Part Of Publisher References Rights Holder Source Title Type ``` Once the files are ready for deposit, we want to do the actual deposit in one shot. For this, we need to provide: - the contents and their associated correct content-types - either the header `In-Progress` to false (meaning, it's finished after this query) or nothing (the server will assume it's not in progress if not present). - Optionally, the `Slug` header, which is a reference to a unique identifier the client knows about and wants to provide us. You can do this with the following command: ``` Shell curl -i --user : \ -F "file=@deposit.zip;type=application/zip;filename=payload" \ -F "atom=@atom-entry.xml;type=application/atom+xml;charset=UTF-8" \ -H 'In-Progress: false' \ -H 'Slug: some-external-id' \ -XPOST https://deposit.softwareheritage.org/1// ``` You just posted a deposit to the collection https://deposit.softwareheritage.org/1//. If everything went well, you should have received a response similar to this: ``` Shell HTTP/1.0 201 Created Server: WSGIServer/0.2 CPython/3.5.3 Location: /1//10/metadata/ Content-Type: application/xml 9 Sept. 26, 2017, 10:11 a.m. payload - ready + ready http://purl.org/net/sword/package/SimpleZip ``` Explaining this response: - `HTTP/1.0 201 Created`: the deposit is successful - `Location: /1//10/metadata/`: the EDIT-SE-IRI through which we can update a deposit - body response: it is a deposit receipt detailing all endpoints available to manipulate the deposit (update, replace, delete, etc...) It also explains the deposit identifier to be 9 (which is useful for the remaining example). 
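As a minimal sketch of reading such a receipt on the client side, the Python snippet below extracts the deposit identifier and status from a receipt body. The sample body and the element names `deposit_id` and `deposit_status` are assumptions made for illustration (the test suite in this change reads `deposit_status` from the Atom namespace); a real receipt carries more fields than shown here.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical receipt body: the element names are assumptions, not the
# server's documented output.
SAMPLE_RECEIPT = """<entry xmlns="http://www.w3.org/2005/Atom">
    <deposit_id>9</deposit_id>
    <deposit_status>ready</deposit_status>
</entry>"""


def parse_receipt(body):
    """Return the deposit id and status found in a receipt body."""
    root = ET.fromstring(body)

    def text_of(name):
        # Elements live in the Atom namespace, so qualify the tag name.
        element = root.find("{%s}%s" % (ATOM_NS, name))
        if element is None or element.text is None:
            return None
        return element.text.strip()

    return {
        "deposit_id": text_of("deposit_id"),
        "deposit_status": text_of("deposit_status"),
    }


receipt = parse_receipt(SAMPLE_RECEIPT)
print(receipt)
```

A client would typically keep the identifier (and the `Location` header of the response) to address the deposit in later requests.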
Note: As the deposit is in `ready` status (meaning ready to be injected), you cannot actually update anything after this query. Well, the client can try, but it will be answered with a 403 forbidden answer. ### Multi-steps deposit 1. Create a deposit We will use the collection IRI again as the starting point. We need to explicitely give to the server information about: - the deposit's completeness (through header `In-Progress` to true, as we want to do in multiple steps now). - archive's md5 hash (through header `Content-MD5`) - upload's type (through the headers `Content-Disposition` and `Content-Type`) The following command: ``` Shell curl -i --user : \ --data-binary @swh/deposit.zip \ -H 'In-Progress: true' \ -H 'Content-MD5: 0faa1ecbf9224b9bf48a7c691b8c2b6f' \ -H 'Content-Disposition: attachment; filename=[deposit.zip]' \ -H 'Slug: some-external-id' \ -H 'Packaging: http://purl.org/net/sword/package/SimpleZIP' \ -H 'Content-type: application/zip' \ -XPOST https://deposit.softwareheritage.org/1// ``` The expected answer is the same as the previous sample. 2. Update deposit's metadata To update a deposit, we can either add some more archives, some more metadata or replace existing ones. As we don't have defined metadata yet (except for the `slug` header), we can add some to the `EDIT-SE-IRI` endpoint (/1//10/metadata/). That information is extracted from the deposit receipt sample. Using here the same atom-entry.xml file presented in previous chapter. For example, here is the command to update deposit metadata: ``` Shell curl -i --user : --data-binary @atom-entry.xml \ -H 'In-Progress: true' \ -H 'Slug: some-external-id' \ -H 'Content-Type: application/atom+xml;type=entry' \ -XPOST https://deposit.softwareheritage.org/1//10/metadata/ HTTP/1.0 201 Created Server: WSGIServer/0.2 CPython/3.5.3 Location: /1//10/metadata/ Content-Type: application/xml 10 Sept. 26, 2017, 10:32 a.m. None - partial + partial http://purl.org/net/sword/package/SimpleZip ``` 3. 
Check the deposit's state You need to check the STATE-IRI endpoint (/1//10/status/). ``` Shell curl -i --user : https://deposit.softwareheritage.org/1//10/status/ HTTP/1.0 200 OK Date: Wed, 27 Sep 2017 08:25:53 GMT Content-Type: application/xml ``` Response: ``` XML 9 - ready - deposit is fully received and ready for injection + ready + deposit is fully received and ready for injection ``` diff --git a/docs/spec-api.md b/docs/spec-api.md index 7749b116..6899f683 100644 --- a/docs/spec-api.md +++ b/docs/spec-api.md @@ -1,790 +1,790 @@ # API Specification This is [Software Heritage](https://www.softwareheritage.org)'s [SWORD 2.0](http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html) Server implementation. **S.W.O.R.D** (**S**imple **W**eb-Service **O**ffering **R**epository **D**eposit) is an interoperability standard for digital file deposit. This implementation will permit interaction between a client (a repository) and a server (SWH repository) to permit deposits of software source code archives and associated metadata. *Note:* In the following document, we will use the `archive` or `software source code archive` interchangeably. ## Collection SWORD defines a `collection` concept. In SWH's case, this collection refers to a group of deposits. A `deposit` is some form of software source code archive(s) associated with metadata. *Note:* It may be multiple archives if one archive is too big and must be splitted into multiple smaller ones. ### Example As part of the [HAL](https://hal.archives-ouvertes.fr/)-[SWH](https://www.softwareheritage.org) collaboration, we define a `HAL collection` to which the `hal` client will have access to. ## Limitations We will not have a fully compliant SWORD 2.0 protocol at first, so voluntary implementation shortcomings can exist, for example, only zip tarballs will be accepted. 
Other more permanent limitations exist:

- upload limitation of 100MiB
- no mediation

## Endpoints

Here are the defined endpoints this document will refer to from this point on:

- `/1/servicedocument/` *service document iri* (a.k.a [SD-IRI](#sd-iri-the-service-document-iri))

  *Goal:* For a client to discover its collection's location

- `/1//` *collection iri* (a.k.a [COL-IRI](#col-iri-the-collection-iri))

  *Goal:* Create a deposit in a collection

- `/1///media/` *update iri* (a.k.a [EM-IRI](#em-iri-the-atom-edit-media-iri))

  *Goal:* Add or replace archive(s) in a deposit

- `/1///metadata/` *update iri* (a.k.a [EDIT-IRI](#edit-iri-the-atom-entry-edit-iri) merged with [SE-IRI](#se-iri-the-sword-edit-iri))

  *Goal:* Add or replace metadata (and optionally archive(s)) in a deposit

- `/1///status/` *state iri* (a.k.a [STATE-IRI](#state-iri-the-sword-statement-iri))

  *Goal:* Display the deposit's status with regard to injection

- `/1///content/` *content iri* (a.k.a [CONT-FILE-IRI](#cont-iri-the-content-iri))

  *Goal:* Display information on the content's representation in the sword server

## Use cases

### Deposit creation

From client's deposit repository server to SWH's repository server:

[1.] The client requests the server's abilities and its associated collection (GET query to the *SD/service document uri*)

[2.] The server answers the client with the service document which gives the *collection uri* (also known as *COL/collection IRI*).

[3.] The client sends a deposit (optionally a zip archive, some metadata or both) through the *collection uri*.

This can be done in:

- one POST request (metadata + archive).
- one POST request (metadata or archive) + other PUT or POST requests to the *update uris* (*edit-media iri* or *edit iri*)

[3.1.] Server validates the client's input or returns a detailed error if any

[3.2.] Server stores the information received (metadata or software archive source code or both)

[4.] The server notifies the client that it acknowledged the client's request.
An `http 201 Created` response with a deposit receipt in the body response is sent back. That deposit receipt will hold the necessary information to eventually complete the deposit later on if it was incomplete (also known as status `partial`).

#### Schema representation

![](/images/deposit-create-chart.png)

### Updating an existing deposit

[5.] Client updates an existing deposit through the *update uris* (one or more POST or PUT requests to either the *edit-media iri* or *edit iri*).

[5.1.] Server validates the client's input or returns a detailed error if any

[5.2.] Server stores the information received (metadata or software archive source code or both)

This would be the case, for example, if the client initially posted a `partial` deposit (e.g. only metadata with no archive, or an archive without metadata, or a split archive because the initial one exceeded the size limit imposed by the swh repository deposit).

#### Schema representation

![](/images/deposit-update-chart.png)

### Deleting deposit (or associated archive, or associated metadata)

[6.] Deposit deletion is possible as long as the deposit is still in `partial` state.

[6.1.] Server validates the client's input or returns a detailed error if any

[6.2.] Server actually deletes information according to the request

#### Schema representation

![](/images/deposit-delete-chart.png)

### Client asks for operation status

[7.] Operation status can be read through a GET query to the *state iri*.

### Server: Triggering injection

Once the status `ready` is reached for a deposit, the server will inject the archive(s) sent and the associated metadata.

This is described in the [injection document](./spec-injection.html).

## API overview

API access is over HTTPS.

The API is protected through basic authentication.

The API endpoints are rooted at [https://deposit.softwareheritage.org/1/](https://deposit.softwareheritage.org/1/).

Data is sent and received as XML (as specified in the SWORD 2.0 specification).
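To make the access rules above concrete, here is a minimal Python sketch that prepares an authenticated GET request for the service document using only the standard library. The helper names and the `hal`/`secret` credentials are placeholders chosen for illustration; the request is built but deliberately not sent.

```python
import base64
import urllib.request

# API root as stated in this document.
API_ROOT = "https://deposit.softwareheritage.org/1/"


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def service_document_request(username, password):
    """Prepare (but do not send) a GET request for the service document."""
    request = urllib.request.Request(API_ROOT + "servicedocument/", method="GET")
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Placeholder credentials, for illustration only.
request = service_document_request("hal", "secret")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would perform the actual call; the XML body of the response could then be parsed with any XML library.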
In the following chapters, we will described the different endpoints [through the use cases described previously.](#use-cases) ### [2] Service document Endpoint: GET /1/servicedocument/ This is the starting endpoint for the client to discover its initial collection. The answer to this query will describes: - the server's abilities - connected client's collection information Also known as: [SD-IRI - The Service Document IRI](#sd-iri-the-service-document-iri). #### Sample request ``` Shell GET https://deposit.softwareheritage.org/1/servicedocument/ HTTP/1.1 Host: deposit.softwareheritage.org ``` The server returns its abilities with the service document in xml format: - protocol sword version v2 - accepted mime types: application/zip - upload max size accepted. Beyond that point, it's expected the client splits its tarball into multiple ones - the collection the client can act upon (swh supports only one software collection per client) - mediation is not supported - etc... The current answer for example for the [hal archive](https://hal.archives-ouvertes.fr/) is: ``` XML 2.0 20971520 The Software Heritage (SWH) archive SWH Software Archive application/zip Collection Policy Software Heritage Archive false false Collect, Preserve, Share http://purl.org/net/sword/package/SimpleZip https://deposit.softwareheritage.org/1/hal/ ``` ### [3|5] Deposit creation/update The client can send deposit creation/update through a series of deposit requests to the following endpoints: - *collection iri* (COL-IRI) to initialize a deposit - *update iris* (EM-IRI, EDIT-SE-IRI) to complete/finalize a deposit The deposit creation/update can also happens in one request. The deposit request can contain: - an archive holding the software source code (binary upload) - an envelop with metadata describing information regarding a deposit (atom entry deposit) - or both (multipart deposit, exactly one archive and one envelop). 
#### Request Types ##### Binary deposit The client can deposit a binary archive, supplying the following headers: - Content-Type (text): accepted mimetype - Content-Length (int): tarball size - Content-MD5 (text): md5 checksum hex encoded of the tarball - Content-Disposition (text): attachment; filename=[filename] ; the filename parameter must be text (ascii) - Packaging (IRI): http://purl.org/net/sword/package/SimpleZip - In-Progress (bool): true to specify it's not the last request, false to specify it's a final request and the server can go on with processing the request's information (if not provided, this is considered false, so final). This is a single zip archive deposit. Almost no metadata is associated with the archive except for the unique external identifier. *Note:* This kind of deposit should be `partial` (In-Progress: True) as almost no metadata can be associated with the uploaded archive. ##### API endpoints concerned POST /1// Create a first deposit with one archive PUT /1///media/ Replace existing archives POST /1///media/ Add new archive ##### Sample request ``` Shell curl -i -u hal: \ --data-binary @swh/deposit.zip \ -H 'In-Progress: false' -H 'Content-MD5: 0faa1ecbf9224b9bf48a7c691b8c2b6f' \ -H 'Content-Disposition: attachment; filename=[deposit.zip]' \ -H 'Slug: some-external-id' \ -H 'Packaging: http://purl.org/net/sword/package/SimpleZIP' \ -H 'Content-type: application/zip' \ -XPOST https://deposit.softwareheritage.org/1/hal/ ``` #### Atom entry deposit The client can deposit an xml body holding metadata information on the deposit. *Note:* This kind of deposit is mostly expected to be `partial` (In-Progress: True) since no archive will be associated to those metadata. 
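The headers for the two request types described so far can be assembled as in the following sketch. The helper names `binary_deposit_headers` and `atom_deposit_headers` are hypothetical; the header names and values follow the lists and sample requests in this document, and the `Content-Disposition` layout is an assumption based on those samples.

```python
import hashlib

SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def in_progress_value(in_progress):
    """Render the In-Progress flag the way the sample requests do."""
    return "true" if in_progress else "false"


def binary_deposit_headers(archive, filename, in_progress, slug=None):
    """Headers for a binary (zip archive) deposit; `archive` is bytes."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Length": str(len(archive)),
        # Hex-encoded md5 checksum of the uploaded archive.
        "Content-MD5": hashlib.md5(archive).hexdigest(),
        "Content-Disposition": "attachment; filename=%s" % filename,
        "Packaging": SIMPLE_ZIP,
        "In-Progress": in_progress_value(in_progress),
    }
    if slug is not None:
        headers["Slug"] = slug
    return headers


def atom_deposit_headers(in_progress, slug=None):
    """Headers for an atom entry (metadata only) deposit."""
    headers = {
        "Content-Type": "application/atom+xml;type=entry",
        "In-Progress": in_progress_value(in_progress),
    }
    if slug is not None:
        headers["Slug"] = slug
    return headers


binary_headers = binary_deposit_headers(
    b"abc", "deposit.zip", in_progress=True, slug="some-external-id")
atom_headers = atom_deposit_headers(in_progress=False, slug="some-external-id")
```

Sending the archive with `In-Progress: true` and the metadata afterwards with `In-Progress: false` mirrors the multi-step flow, where only the last request finalizes the deposit.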
##### API endpoints concerned POST /1// Create a first atom deposit entry PUT /1///metadata/ Replace existing metadata POST /1///metadata/ Add new metadata to deposit ##### Sample request Sample query: ``` Shell curl -i -u hal: --data-binary @atom-entry.xml \ -H 'In-Progress: false' \ -H 'Slug: some-external-id' \ -H 'Content-Type: application/atom+xml;type=entry' \ -XPOST https://deposit.softwareheritage.org/1/hal/ HTTP/1.0 201 Created Date: Tue, 26 Sep 2017 10:32:35 GMT Server: WSGIServer/0.2 CPython/3.5.3 Vary: Accept, Cookie Allow: GET, POST, PUT, DELETE, HEAD, OPTIONS Location: /1/hal/10/metadata/ X-Frame-Options: SAMEORIGIN Content-Type: application/xml 10 Sept. 26, 2017, 10:32 a.m. None - ready + ready http://purl.org/net/sword/package/SimpleZip ``` Sample body: ``` XML Title urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a 2005-10-07T17:17:08Z Contributor The abstract The abstract Access Rights Alternative Title Date Available Bibliographic Citation # noqa Contributor Description Has Part Has Version Identifier Is Part Of Publisher References Rights Holder Source Title Type ``` #### One request deposit / Multipart deposit The one request deposit is a single request containing both the metadata (as atom entry attachment) and the archive (as payload attachment). Thus, it is a multipart deposit. Client provides: - Content-Disposition (text): header of type 'attachment' on the Entry Part with a name parameter set to 'atom' - Content-Disposition (text): header of type 'attachment' on the Media Part with a name parameter set to payload and a filename parameter (the filename will be expressed in ASCII). - Content-MD5 (text): md5 checksum hex encoded of the tarball - Packaging (text): http://purl.org/net/sword/package/SimpleZip (packaging format used on the Media Part) - In-Progress (bool): true|false; true means `partial` upload and we can expect other requests in the future, false means the deposit is done. 
- add metadata formats or foreign markup to the atom:entry element ##### API endpoints concerned POST /1// Create a full deposit (metadata + archive) PUT /1///metadata/ Replace existing metadata and archive POST /1///metadata/ Add new metadata and archive to deposit ##### Sample request Sample query: ``` Shell curl -i -u hal: \ -F "file=@../deposit.json;type=application/zip;filename=payload" \ -F "atom=@../atom-entry.xml;type=application/atom+xml;charset=UTF-8" \ -H 'In-Progress: false' \ -H 'Slug: some-external-id' \ -XPOST https://deposit.softwareheritage.org/1/hal/ HTTP/1.0 201 Created Date: Tue, 26 Sep 2017 10:11:55 GMT Server: WSGIServer/0.2 CPython/3.5.3 Vary: Accept, Cookie Allow: GET, POST, PUT, DELETE, HEAD, OPTIONS Location: /1/hal/9/metadata/ X-Frame-Options: SAMEORIGIN Content-Type: application/xml 9 Sept. 26, 2017, 10:11 a.m. payload - ready + ready http://purl.org/net/sword/package/SimpleZip ``` Sample content: ``` XML POST deposit HTTP/1.1 Host: deposit.softwareheritage.org Content-Length: [content length] Content-Type: multipart/related; boundary="===============1605871705=="; type="application/atom+xml" In-Progress: false MIME-Version: 1.0 Media Post --===============1605871705== Content-Type: application/atom+xml; charset="utf-8" Content-Disposition: attachment; name="atom" MIME-Version: 1.0 Title hal-or-other-archive-id 2005-10-07T17:17:08Z Contributor The abstract Access Rights Alternative Title Date Available Bibliographic Citation # noqa Contributor Description Has Part Has Version Identifier Is Part Of Publisher References Rights Holder Source Title Type --===============1605871705== Content-Type: application/zip Content-Disposition: attachment; name=payload; filename=[filename] Packaging: http://purl.org/net/sword/package/SimpleZip Content-MD5: [md5-digest] MIME-Version: 1.0 [...binary package data...] 
--===============1605871705==-- ``` ## Deposit Creation - server point of view The server receives the request(s) and does minimal checking on the input prior to any saving operations. ### [3|5|6.1] Validation of the header and body request Any kind of errors can happen, here is the list depending on the situation: - common errors: - 401 (unauthenticated) if a client does not provide credential or provide wrong ones - 403 (forbidden) if a client tries access to a collection it does not own - 404 (not found) if a client tries access to an unknown collection - 404 (not found) if a client tries access to an unknown deposit - 415 (unsupported media type) if a wrong media type is provided to the endpoint - archive/binary deposit: - 403 (forbidden) if the length of the archive exceeds the max size configured - 412 (precondition failed) if the length or hash provided mismatch the reality of the archive. - 415 (unsupported media type) if a wrong media type is provided - multipart deposit: - 412 (precondition failed) if the md5 hash provided mismatch the reality of the archive - 415 (unsupported media type) if a wrong media type is provided - Atom entry deposit: - 400 (bad request) if the request's body is empty (for creation only) ### [3|5|6.2] Server uploads the content in a temporary location Using an objstorage, the server stores the archive in a temporary location. It's deemed temporary the time the deposit is completed (status becomes `ready`) and the injection finishes. The server also persists requests' information in a database. ### [4] Servers answers the client If everything went well, the server answers either with a 200, 201 or 204 response (depending on the actual endpoint) A `http 200` response is returned for GET endpoints. A `http 201 Created` response is returned for POST endpoints. The body holds the deposit receipt. The headers holds the EDIT-IRI in the Location header of the response. 
A `http 204 No Content` response is returned for PUT, DELETE endpoints.

If something went wrong, the server answers with one of the [error status codes and associated messages mentioned](#possible-errors).

### [5] Deposit Update

The client previously deposited a `partial` document (through an archive, metadata, or both). The client wants to update information for that previous deposit (possibly in multiple steps as well).

The important thing to note here is that, as long as the deposit is in status `partial`, the injection has not started. Thus, the client can update information (replace or add new archive, new metadata, even delete) for that same `partial` deposit.

When the deposit status changes to `ready`, the client can no longer change the deposit's information (a 403 will be returned in that case).

The aggregation of all that deposit's information will later be used for the actual injection.

Providing the collection name and the identifier of the previous deposit (received from the deposit receipt), the client executes a POST or PUT request on the *update iris*.

After validation of the body request, the server:

- uploads such content in a temporary location
- answers the client with an `http 204 (No content)`. In the Location header of the response lies an iri to permit further updates.
- Asynchronously, the server will inject the archive uploaded and the associated metadata. An operation status endpoint *state iri* permits the client to query the injection operation status.

#### Possible update endpoints

PUT /1///media/

Replace existing archives for the deposit

POST /1///media/

Add new archives to the deposit

PUT /1///metadata/

Replace existing metadata (and possibly archives)

POST /1///metadata/

Add new metadata

### [6] Deposit Removal

As long as the deposit's status remains `partial`, it's possible to remove the deposit entirely or remove only the deposit's archive(s).

If the deposit has been removed, further querying that deposit will return a *404* response.
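The rule shared by updates and removals (a deposit can only be changed while it is `partial`) can be sketched as a tiny state check. This is an illustration of the rule as described here, not the server's implementation: the function names are hypothetical, and the constant values are assumed to match the status names used in this document.

```python
# Status names as used in this document; the values are assumptions.
DEPOSIT_STATUS_PARTIAL = "partial"
DEPOSIT_STATUS_READY = "ready"


def can_modify(status):
    """A deposit accepts updates or removals only while it is partial."""
    return status == DEPOSIT_STATUS_PARTIAL


def status_after_request(current_status, in_progress):
    """Return the deposit status once an update request is accepted.

    Raises PermissionError when the deposit can no longer be changed.
    """
    if not can_modify(current_status):
        raise PermissionError(
            "deposit with status '%s' cannot be modified" % current_status)
    # In-Progress: true keeps the deposit open, false finalizes it.
    return DEPOSIT_STATUS_PARTIAL if in_progress else DEPOSIT_STATUS_READY
```

For instance, `status_after_request("partial", in_progress=False)` finalizes the deposit, after which any further call for that deposit raises.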
If the deposit's archive(s) have been removed, the client can still issue other queries to update that deposit.

### Operation Status

Providing a collection name and a deposit id, the client asks for the operation status of a prior deposit.

URL: GET /1///status/

This returns:

- *200* response with the actual status
- *404* if the deposit does not exist (or no longer does)

## Possible errors

### sword:ErrorContent

IRI: `http://purl.org/net/sword/error/ErrorContent`

The supplied format is not the same as that identified in the Packaging header and/or that supported by the server.

Associated HTTP status: *415 (Unsupported Media Type)*

### sword:ErrorChecksumMismatch

IRI: `http://purl.org/net/sword/error/ErrorChecksumMismatch`

Checksum sent does not match the calculated checksum.

Associated HTTP status: *412 Precondition Failed*

### sword:ErrorBadRequest

IRI: `http://purl.org/net/sword/error/ErrorBadRequest`

Some parameters sent with the POST/PUT were not understood.

Associated HTTP status: *400 Bad Request*

### sword:MediationNotAllowed

IRI: `http://purl.org/net/sword/error/MediationNotAllowed`

Used where a client has attempted a mediated deposit, but this is not supported by the server.

Associated HTTP status: *412 Precondition Failed*

### sword:MethodNotAllowed

IRI: `http://purl.org/net/sword/error/MethodNotAllowed`

Used when the client has attempted one of the HTTP update verbs (POST, PUT, DELETE) but the server has decided not to respond to such requests on the specified resource at that time.

Associated HTTP status: *405 Method Not Allowed*

### sword:MaxUploadSizeExceeded

IRI: `http://purl.org/net/sword/error/MaxUploadSizeExceeded`

Used when the client has attempted to supply to the server a file which exceeds the server's maximum upload size limit.

Associated HTTP status: *413 (Request Entity Too Large)*

### sword:Unauthorized

IRI: `http://purl.org/net/sword/error/ErrorUnauthorized`

The access to the api is through authentication.
Associated HTTP status: *401* ### sword:Forbidden IRI: `http://purl.org/net/sword/error/ErrorForbidden` The action is forbidden (access to another collection for example). Associated HTTP status: *403* ## Nomenclature SWORD uses IRI notion, Internationalized Resource Identifier. In this chapter, we will describe SWH's IRIs. ### SD-IRI - The Service Document IRI The Service Document IRI. This is the IRI from which the client can discover its collection IRI. HTTP verbs supported: *GET* ### Col-IRI - The Collection IRI The software collection associated to one user. The SWORD Collection IRI is the IRI to which the initial deposit will take place, and which is listed in the Service Document. Following our previous example, this is: https://deposit.softwareheritage.org/1/hal/. HTTP verbs supported: *POST* ### Cont-IRI - The Content IRI This is the endpoint which permits the client to retrieve representations of the object as it resides in the SWORD server. This will display information about the content and its associated metadata. HTTP verbs supported: *GET* *Note:* We also refer to it as *Cont-File-IRI*. ### EM-IRI - The Atom Edit Media IRI This is the endpoint to upload other related archives for the same deposit. It is used to change a `partial` deposit in regards of archives, in particular: - replace existing archives with new ones - add new archives - delete archives from a deposit Example use case: A first archive to put exceeds the deposit's limit size. The client can thus split the archives in multiple ones. Post a first `partial` archive to the Col-IRI (with In-Progress: True). Then, in order to complete the deposit, POST the other remaining archives to the EM-IRI (the last one with the In-Progress header to False). HTTP verbs supported: *POST*, *PUT*, *DELETE* ### Edit-IRI - The Atom Entry Edit IRI This is the endpoint to change a `partial` deposit in regards of metadata. 
In particular: - replace existing metadata (and archives) with new ones - add new metadata (and archives) - delete deposit HTTP verbs supported: *POST*, *PUT*, *DELETE* *Note:* We also refer to it as *Edit-SE-IRI*. ### SE-IRI - The SWORD Edit IRI The sword specification permits to merge this with EDIT-IRI, so we did. *Note:* We also refer to it as *Edit-SE-IRI*. ### State-IRI - The SWORD Statement IRI This is the IRI which can be used to retrieve a description of the object from the sword server, including the structure of the object and its state. This will be used as the operation status endpoint. HTTP verbs supported: *GET* ## Sources - [SWORD v2 specification](http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html) - [arxiv documentation](https://arxiv.org/help/submit_sword) - [Dataverse example](http://guides.dataverse.org/en/4.3/api/sword.html) - [SWORD used on HAL](https://api.archives-ouvertes.fr/docs/sword) - [xml examples for CCSD](https://github.com/CCSDForge/HAL/tree/master/Sword) diff --git a/swh/deposit/templates/deposit/content.xml b/swh/deposit/templates/deposit/content.xml index f4f0f02d..c1686efa 100644 --- a/swh/deposit/templates/deposit/content.xml +++ b/swh/deposit/templates/deposit/content.xml @@ -1,14 +1,16 @@ {{ deposit_id }} - {{ status }} - {{ status_detail }} + {{ status }} + {{ status_detail }} {% for request in requests %} + {% if request and request.metadata and request.metadata|length > 0 %} - {% for k, v in request.metadata.archive.items %}<{{ k }}>{{ v }} + {% for k, v in request.metadata.items %}<{{ k }}>{{ v }} {% endfor %} + {% endif %} {{ request.date }} {% endfor %} diff --git a/swh/deposit/templates/deposit/deposit_receipt.xml b/swh/deposit/templates/deposit/deposit_receipt.xml index 33f667b1..64cf2c09 100644 --- a/swh/deposit/templates/deposit/deposit_receipt.xml +++ b/swh/deposit/templates/deposit/deposit_receipt.xml @@ -1,19 +1,19 @@ {{ deposit_id }} {{ deposit_date }} {{ archive }} - {{ status }} + {{ status }} {% for 
packaging in packagings %}{{ packaging }}{% endfor %} diff --git a/swh/deposit/templates/deposit/status.xml b/swh/deposit/templates/deposit/status.xml index 40698950..df64d71d 100644 --- a/swh/deposit/templates/deposit/status.xml +++ b/swh/deposit/templates/deposit/status.xml @@ -1,7 +1,7 @@ {{ deposit_id }} - {{ status }} - {{ status_detail }} + {{ status }} + {{ status_detail }} diff --git a/swh/deposit/tests/api/test_deposit.py b/swh/deposit/tests/api/test_deposit.py index 46d2c896..441fc6f4 100644 --- a/swh/deposit/tests/api/test_deposit.py +++ b/swh/deposit/tests/api/test_deposit.py @@ -1,119 +1,119 @@ # Copyright (C) 2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import hashlib from django.core.urlresolvers import reverse from io import BytesIO from nose.tools import istest from rest_framework import status from rest_framework.test import APITestCase from swh.deposit.config import COL_IRI, EDIT_SE_IRI, DEPOSIT_STATUS_REJECTED from swh.deposit.config import DEPOSIT_STATUS_PARTIAL from swh.deposit.models import Deposit, DepositClient, DepositCollection from swh.deposit.parsers import parse_xml from ..common import BasicTestCase, WithAuthTestCase, CommonCreationRoutine class DepositNoAuthCase(APITestCase, BasicTestCase): """Deposit access are protected with basic authentication. """ @istest def post_will_fail_with_401(self): """Without authentication, endpoint refuses access with 401 response """ url = reverse(COL_IRI, args=[self.collection.name]) # when response = self.client.post(url) # then self.assertEqual(response.status_code, status.HTTP_401_UNAUTHORIZED) class DepositFailuresTest(APITestCase, WithAuthTestCase, BasicTestCase, CommonCreationRoutine): """Deposit access are protected with basic authentication. 
""" def setUp(self): super().setUp() # Add another user _collection2 = DepositCollection(name='some') _collection2.save() _user = DepositClient.objects.create_user(username='user', password='user') _user.collections = [_collection2.id] self.collection2 = _collection2 @istest def access_to_another_user_collection_is_forbidden(self): """Access to another user collection should return a 403 """ url = reverse(COL_IRI, args=[self.collection2.name]) response = self.client.post(url) self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN) @istest def delete_on_col_iri_not_supported(self): """Delete on col iri should return a 405 response """ url = reverse(COL_IRI, args=[self.collection.name]) response = self.client.delete(url) self.assertEqual(response.status_code, status.HTTP_405_METHOD_NOT_ALLOWED) @istest def create_deposit_with_rejection_status(self): url = reverse(COL_IRI, args=[self.collection.name]) data = b'some data which is clearly not a zip file' md5sum = hashlib.md5(data).hexdigest() external_id = 'some-external-id-1' # when response = self.client.post( url, content_type='application/zip', # as zip data=data, # + headers CONTENT_LENGTH=len(data), # other headers needs HTTP_ prefix to be taken into account HTTP_SLUG=external_id, HTTP_CONTENT_MD5=md5sum, HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') self.assertEquals(response.status_code, status.HTTP_201_CREATED) response_content = parse_xml(BytesIO(response.content)) actual_state = response_content[ - '{http://www.w3.org/2005/Atom}deposit_state'] + '{http://www.w3.org/2005/Atom}deposit_status'] self.assertEquals(actual_state, DEPOSIT_STATUS_REJECTED) @istest def act_on_deposit_rejected_is_not_permitted(self): deposit_id = self.create_deposit_with_status_rejected() deposit = Deposit.objects.get(pk=deposit_id) assert deposit.status == DEPOSIT_STATUS_REJECTED response = self.client.post( reverse(EDIT_SE_IRI, 
args=[self.collection.name, deposit_id]), content_type='application/atom+xml;type=entry', data=self.atom_entry_data1, HTTP_SLUG='external-id') self.assertEquals(response.status_code, status.HTTP_400_BAD_REQUEST) self.assertRegex( response.content.decode('utf-8'), "You can only act on deposit with status '%s'" % ( DEPOSIT_STATUS_PARTIAL, )) diff --git a/swh/deposit/tests/api/test_deposit_binary.py b/swh/deposit/tests/api/test_deposit_binary.py index b5a1b75b..933dcffc 100644 --- a/swh/deposit/tests/api/test_deposit_binary.py +++ b/swh/deposit/tests/api/test_deposit_binary.py @@ -1,660 +1,660 @@ # Copyright (C) 2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information from django.core.files.uploadedfile import InMemoryUploadedFile from django.core.urlresolvers import reverse from io import BytesIO from nose.tools import istest from rest_framework import status from rest_framework.test import APITestCase from swh.deposit.tests import TEST_CONFIG from swh.deposit.config import COL_IRI, EM_IRI from swh.deposit.config import DEPOSIT_STATUS_READY from swh.deposit.models import Deposit, DepositRequest from swh.deposit.parsers import parse_xml from ..common import BasicTestCase, WithAuthTestCase, create_arborescence_zip from ..common import FileSystemCreationRoutine class DepositTestCase(APITestCase, WithAuthTestCase, BasicTestCase, FileSystemCreationRoutine): """Try and upload one single deposit """ def setUp(self): super().setUp() self.atom_entry_data0 = b""" Awesome Compiler hal urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a %s 2017-10-07T15:17:08Z some awesome author something awesome-compiler This is an awesome compiler destined to awesomely compile stuff and other stuff compiler,programming,language 2005-10-07T17:17:08Z 2005-10-07T17:17:08Z release note related link Awesome 
https://hoster.org/awesome-compiler GNU/Linux 0.0.1 running all """ self.atom_entry_data1 = b""" hal urn:uuid:2225c695-cfb8-4ebb-aaaa-80da344efa6a 2017-10-07T15:17:08Z some awesome author something awesome-compiler This is an awesome compiler destined to awesomely compile stuff and other stuff compiler,programming,language 2005-10-07T17:17:08Z 2005-10-07T17:17:08Z release note related link Awesome https://hoster.org/awesome-compiler GNU/Linux 0.0.1 running all """ self.atom_entry_data2 = b""" %s """ self.atom_entry_data_empty_body = b""" """ self.atom_entry_data3 = b""" something """ self.data_atom_entry_ok = b""" Title urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a 2005-10-07T17:17:08Z Contributor The abstract The abstract Access Rights Alternative Title Date Available Bibliographic Citation # noqa Contributor Description Has Part Has Version Identifier Is Part Of Publisher References Rights Holder Source Title Type """ @istest def post_deposit_binary_without_slug_header_is_bad_request(self): """Posting a binary deposit without slug header should return 400 """ url = reverse(COL_IRI, args=[self.collection.name]) # when response = self.client.post( url, content_type='application/zip', # as zip data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') self.assertIn(b'Missing SLUG header', response.content) self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST) @istest def post_deposit_binary_upload_final_and_status_check(self): """Binary upload with correct headers should return 201 with receipt """ # given url = reverse(COL_IRI, args=[self.collection.name]) external_id = 'some-external-id-1' # when response = self.client.post( url, content_type='application/zip', # as zip data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], 
# other headers needs HTTP_ prefix to be taken into account HTTP_SLUG=external_id, HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=%s' % ( self.archive['name'], )) # then response_content = parse_xml(BytesIO(response.content)) self.assertEqual(response.status_code, status.HTTP_201_CREATED) deposit_id = response_content[ '{http://www.w3.org/2005/Atom}deposit_id'] deposit = Deposit.objects.get(pk=deposit_id) self.assertEqual(deposit.status, DEPOSIT_STATUS_READY) self.assertEqual(deposit.external_id, external_id) self.assertEqual(deposit.collection, self.collection) self.assertEqual(deposit.client, self.user) self.assertIsNone(deposit.swh_id) deposit_request = DepositRequest.objects.get(deposit=deposit) self.assertEquals(deposit_request.deposit, deposit) self.assertRegex(deposit_request.archive.name, self.archive['name']) response_content = parse_xml(BytesIO(response.content)) self.assertEqual( response_content['{http://www.w3.org/2005/Atom}deposit_archive'], self.archive['name']) self.assertEqual( response_content['{http://www.w3.org/2005/Atom}deposit_id'], deposit.id) self.assertEqual( - response_content['{http://www.w3.org/2005/Atom}deposit_state'], + response_content['{http://www.w3.org/2005/Atom}deposit_status'], deposit.status) edit_se_iri = reverse('edit_se_iri', args=[self.collection.name, deposit.id]) self.assertEqual(response._headers['location'], ('Location', 'http://testserver' + edit_se_iri)) @istest def post_deposit_binary_upload_only_supports_zip(self): """Binary upload without content_type application/zip should return 415 """ # given url = reverse(COL_IRI, args=[self.collection.name]) external_id = 'some-external-id-1' # when response = self.client.post( url, content_type='application/octet-stream', data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG=external_id, 
HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') # then self.assertEqual(response.status_code, status.HTTP_415_UNSUPPORTED_MEDIA_TYPE) with self.assertRaises(Deposit.DoesNotExist): Deposit.objects.get(external_id=external_id) @istest def post_deposit_binary_fails_if_unsupported_packaging_header( self): """Bin deposit without supported content_disposition header returns 400 """ # given url = reverse(COL_IRI, args=[self.collection.name]) external_id = 'some-external-id' # when response = self.client.post( url, content_type='application/zip', data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='something-unsupported', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') # then self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST) with self.assertRaises(Deposit.DoesNotExist): Deposit.objects.get(external_id=external_id) @istest def post_deposit_binary_upload_fail_if_no_content_disposition_header( self): """Binary upload without content_disposition header should return 400 """ # given url = reverse(COL_IRI, args=[self.collection.name]) external_id = 'some-external-id' # when response = self.client.post( url, content_type='application/zip', data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false') # then self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST) with self.assertRaises(Deposit.DoesNotExist): Deposit.objects.get(external_id=external_id) @istest def post_deposit_mediation_not_supported(self): """Binary upload with mediation should return a 412 response """ # given url = reverse(COL_IRI, args=[self.collection.name]) 
external_id = 'some-external-id-1' # when response = self.client.post( url, content_type='application/zip', data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_ON_BEHALF_OF='someone', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') # then self.assertEqual(response.status_code, status.HTTP_412_PRECONDITION_FAILED) with self.assertRaises(Deposit.DoesNotExist): Deposit.objects.get(external_id=external_id) @istest def post_deposit_binary_upload_fail_if_upload_size_limit_exceeded( self): """Binary upload must not exceed the limit set up... """ # given url = reverse(COL_IRI, args=[self.collection.name]) archive = create_arborescence_zip( self.root_path, 'archive2', 'file2', b'some content in file', up_to_size=TEST_CONFIG['max_upload_size']) external_id = 'some-external-id' # when response = self.client.post( url, content_type='application/zip', data=archive['data'], # + headers CONTENT_LENGTH=archive['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') # then self.assertEqual(response.status_code, status.HTTP_413_REQUEST_ENTITY_TOO_LARGE) self.assertRegex(response.content, b'Upload size limit exceeded') with self.assertRaises(Deposit.DoesNotExist): Deposit.objects.get(external_id=external_id) @istest def post_deposit_2_post_2_different_deposits(self): """2 posting deposits should return 2 different 201 with receipt """ url = reverse(COL_IRI, args=[self.collection.name]) # when response = self.client.post( url, content_type='application/zip', # as zip data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG='some-external-id-1', HTTP_CONTENT_MD5=self.archive['md5sum'], 
HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') # then self.assertEqual(response.status_code, status.HTTP_201_CREATED) response_content = parse_xml(BytesIO(response.content)) deposit_id = response_content[ '{http://www.w3.org/2005/Atom}deposit_id'] deposit = Deposit.objects.get(pk=deposit_id) deposits = Deposit.objects.all() self.assertEqual(len(deposits), 1) self.assertEqual(deposits[0], deposit) # second post response = self.client.post( url, content_type='application/zip', # as zip data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG='another-external-id', HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename1') self.assertEqual(response.status_code, status.HTTP_201_CREATED) response_content = parse_xml(BytesIO(response.content)) deposit_id2 = response_content[ '{http://www.w3.org/2005/Atom}deposit_id'] deposit2 = Deposit.objects.get(pk=deposit_id2) self.assertNotEqual(deposit, deposit2) deposits = Deposit.objects.all().order_by('id') self.assertEqual(len(deposits), 2) self.assertEqual(list(deposits), [deposit, deposit2]) @istest def post_deposit_binary_and_post_to_add_another_archive(self): """Updating a deposit should return a 201 with receipt """ # given url = reverse(COL_IRI, args=[self.collection.name]) external_id = 'some-external-id-1' # when response = self.client.post( url, content_type='application/zip', # as zip data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='true', HTTP_CONTENT_DISPOSITION='attachment; filename=%s' % ( self.archive['name'], )) # then self.assertEqual(response.status_code, status.HTTP_201_CREATED) 
response_content = parse_xml(BytesIO(response.content)) deposit_id = response_content[ '{http://www.w3.org/2005/Atom}deposit_id'] deposit = Deposit.objects.get(pk=deposit_id) self.assertEqual(deposit.status, 'partial') self.assertEqual(deposit.external_id, external_id) self.assertEqual(deposit.collection, self.collection) self.assertEqual(deposit.client, self.user) self.assertIsNone(deposit.swh_id) deposit_request = DepositRequest.objects.get(deposit=deposit) self.assertEquals(deposit_request.deposit, deposit) self.assertEquals(deposit_request.type.name, 'archive') self.assertRegex(deposit_request.archive.name, self.archive['name']) # 2nd archive to upload archive2 = create_arborescence_zip( self.root_path, 'archive2', 'file2', b'some other content in file') # uri to update the content update_uri = reverse(EM_IRI, args=[self.collection.name, deposit_id]) # adding another archive for the deposit and finalizing it response = self.client.post( update_uri, content_type='application/zip', # as zip data=archive2['data'], # + headers CONTENT_LENGTH=archive2['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=archive2['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_CONTENT_DISPOSITION='attachment; filename=%s' % ( archive2['name'])) self.assertEqual(response.status_code, status.HTTP_201_CREATED) response_content = parse_xml(BytesIO(response.content)) deposit = Deposit.objects.get(pk=deposit_id) self.assertEqual(deposit.status, DEPOSIT_STATUS_READY) self.assertEqual(deposit.external_id, external_id) self.assertEqual(deposit.collection, self.collection) self.assertEqual(deposit.client, self.user) self.assertIsNone(deposit.swh_id) deposit_requests = list(DepositRequest.objects.filter(deposit=deposit). 
order_by('id')) # 2 deposit requests for the same deposit self.assertEquals(len(deposit_requests), 2) self.assertEquals(deposit_requests[0].deposit, deposit) self.assertEquals(deposit_requests[0].type.name, 'archive') self.assertRegex(deposit_requests[0].archive.name, self.archive['name']) self.assertEquals(deposit_requests[1].deposit, deposit) self.assertEquals(deposit_requests[1].type.name, 'archive') self.assertRegex(deposit_requests[1].archive.name, archive2['name']) # only 1 deposit in db deposits = Deposit.objects.all() self.assertEqual(len(deposits), 1) @istest def post_deposit_then_post_or_put_is_refused_when_status_ready(self): """Updating a deposit with status 'ready' should return a 400 """ url = reverse(COL_IRI, args=[self.collection.name]) external_id = 'some-external-id-1' # when response = self.client.post( url, content_type='application/zip', # as zip data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') # then self.assertEqual(response.status_code, status.HTTP_201_CREATED) response_content = parse_xml(BytesIO(response.content)) deposit_id = response_content[ '{http://www.w3.org/2005/Atom}deposit_id'] deposit = Deposit.objects.get(pk=deposit_id) self.assertEqual(deposit.status, DEPOSIT_STATUS_READY) self.assertEqual(deposit.external_id, external_id) self.assertEqual(deposit.collection, self.collection) self.assertEqual(deposit.client, self.user) self.assertIsNone(deposit.swh_id) deposit_request = DepositRequest.objects.get(deposit=deposit) self.assertEquals(deposit_request.deposit, deposit) self.assertRegex(deposit_request.archive.name, 'filename0') # updating/adding is forbidden # uri to update the content edit_se_iri = reverse( 'edit_se_iri', args=[self.collection.name, deposit_id]) em_iri = reverse( 'em_iri', 
args=[self.collection.name, deposit_id]) # Testing all update/add endpoint should fail # since the status is ready archive2 = create_arborescence_zip( self.root_path, 'archive2', 'file2', b'some content in file 2') # replacing file is no longer possible since the deposit's # status is ready r = self.client.put( em_iri, content_type='application/zip', data=archive2['data'], CONTENT_LENGTH=archive2['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=archive2['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') self.assertEquals(r.status_code, status.HTTP_400_BAD_REQUEST) # adding file is no longer possible since the deposit's status # is ready r = self.client.post( em_iri, content_type='application/zip', data=archive2['data'], CONTENT_LENGTH=archive2['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=archive2['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') self.assertEquals(r.status_code, status.HTTP_400_BAD_REQUEST) # replacing metadata is no longer possible since the deposit's # status is ready r = self.client.put( edit_se_iri, content_type='application/atom+xml;type=entry', data=self.data_atom_entry_ok, CONTENT_LENGTH=len(self.data_atom_entry_ok), HTTP_SLUG=external_id) self.assertEquals(r.status_code, status.HTTP_400_BAD_REQUEST) # adding new metadata is no longer possible since the # deposit's status is ready r = self.client.post( edit_se_iri, content_type='application/atom+xml;type=entry', data=self.data_atom_entry_ok, CONTENT_LENGTH=len(self.data_atom_entry_ok), HTTP_SLUG=external_id) self.assertEquals(r.status_code, status.HTTP_400_BAD_REQUEST) archive_content = b'some content representing archive' archive = InMemoryUploadedFile( BytesIO(archive_content), field_name='archive0', name='archive0', content_type='application/zip', 
size=len(archive_content), charset=None) atom_entry = InMemoryUploadedFile( BytesIO(self.data_atom_entry_ok), field_name='atom0', name='atom0', content_type='application/atom+xml; charset="utf-8"', size=len(self.data_atom_entry_ok), charset='utf-8') # replacing multipart metadata is no longer possible since the # deposit's status is ready r = self.client.put( edit_se_iri, format='multipart', data={ 'archive': archive, 'atom_entry': atom_entry, }) self.assertEquals(r.status_code, status.HTTP_400_BAD_REQUEST) # adding new metadata is no longer possible since the # deposit's status is ready r = self.client.post( edit_se_iri, format='multipart', data={ 'archive': archive, 'atom_entry': atom_entry, }) self.assertEquals(r.status_code, status.HTTP_400_BAD_REQUEST) diff --git a/swh/deposit/tests/api/test_deposit_status.py b/swh/deposit/tests/api/test_deposit_status.py index 06f89161..87ff7356 100644 --- a/swh/deposit/tests/api/test_deposit_status.py +++ b/swh/deposit/tests/api/test_deposit_status.py @@ -1,92 +1,93 @@ # Copyright (C) 2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information from django.core.urlresolvers import reverse from io import BytesIO from nose.tools import istest from rest_framework import status from rest_framework.test import APITestCase from swh.deposit.models import Deposit from swh.deposit.parsers import parse_xml from ..common import BasicTestCase, WithAuthTestCase, FileSystemCreationRoutine from ..common import CommonCreationRoutine from ...config import COL_IRI, STATE_IRI, DEPOSIT_STATUS_READY class DepositStatusTestCase(APITestCase, WithAuthTestCase, BasicTestCase, FileSystemCreationRoutine, CommonCreationRoutine): """Status on deposit """ @istest def post_deposit_with_status_check(self): """Binary upload should be accepted """ # given url = reverse(COL_IRI, 
args=[self.collection.name]) external_id = 'some-external-id-1' # when response = self.client.post( url, content_type='application/zip', # as zip data=self.archive['data'], # + headers CONTENT_LENGTH=self.archive['length'], HTTP_SLUG=external_id, HTTP_CONTENT_MD5=self.archive['md5sum'], HTTP_PACKAGING='http://purl.org/net/sword/package/SimpleZip', HTTP_IN_PROGRESS='false', HTTP_CONTENT_DISPOSITION='attachment; filename=filename0') # then self.assertEqual(response.status_code, status.HTTP_201_CREATED) deposit = Deposit.objects.get(external_id=external_id) status_url = reverse(STATE_IRI, args=[self.collection.name, deposit.id]) # check status status_response = self.client.get(status_url) self.assertEqual(status_response.status_code, status.HTTP_200_OK) r = parse_xml(BytesIO(status_response.content)) self.assertEqual(r['{http://www.w3.org/2005/Atom}deposit_id'], deposit.id) - self.assertEqual(r['{http://www.w3.org/2005/Atom}status'], + self.assertEqual(r['{http://www.w3.org/2005/Atom}deposit_status'], DEPOSIT_STATUS_READY) - self.assertEqual(r['{http://www.w3.org/2005/Atom}detail'], - 'Deposit is fully received, checked, and ready for ' - 'injection') + self.assertEqual( + r['{http://www.w3.org/2005/Atom}deposit_status_detail'], + 'Deposit is fully received, checked, and ready for ' + 'injection') @istest def status_on_unknown_deposit(self): """Asking for the status of unknown deposit returns 404 response""" status_url = reverse(STATE_IRI, args=[self.collection.name, 999]) status_response = self.client.get(status_url) self.assertEqual(status_response.status_code, status.HTTP_404_NOT_FOUND) @istest def status_with_http_accept_header_should_not_break(self): """Asking deposit status with Accept header should return 200 """ deposit_id = self.create_deposit_partial() status_url = reverse(STATE_IRI, args=[ self.collection.name, deposit_id]) response = self.client.get( status_url, HTTP_ACCEPT='text/html,application/xml;q=9,*/*,q=8') self.assertEqual(response.status_code, 
status.HTTP_200_OK)
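
The tests in this diff now look up `deposit_status` and `deposit_status_detail` (previously `status`/`detail` on the State-IRI and `deposit_state` in the deposit receipt), all in the Atom namespace. A client reading the State-IRI response could extract these fields roughly as sketched below. This is a minimal sketch under assumptions, not the project's `parse_xml`: the root element name `entry`, the sample payload, and the helper name `read_deposit_status` are hypothetical; only the three element names, the namespace, the `ready` status, and the detail message come from the tests above.

``` Python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical State-IRI payload: the root element name and layout are
# assumptions; the deposit_* element names are the ones asserted in the tests.
SAMPLE_STATUS = """<entry xmlns="http://www.w3.org/2005/Atom">
    <deposit_id>42</deposit_id>
    <deposit_status>ready</deposit_status>
    <deposit_status_detail>Deposit is fully received, checked, and ready for injection</deposit_status_detail>
</entry>"""


def read_deposit_status(xml_text):
    """Return the deposit_* fields of a status response as a dict of strings.

    Fields that are absent (for instance when the server still emits the old
    element names) are reported as None.
    """
    root = ET.fromstring(xml_text)
    fields = {}
    for name in ("deposit_id", "deposit_status", "deposit_status_detail"):
        # ElementTree addresses namespaced tags as "{namespace}localname",
        # the same key shape the tests use on the parsed response.
        node = root.find("{%s}%s" % (ATOM_NS, name))
        has_text = node is not None and node.text
        fields[name] = node.text.strip() if has_text else None
    return fields


if __name__ == "__main__":
    print(read_deposit_status(SAMPLE_STATUS))
```

Unlike the `parse_xml` helper exercised by the tests, which yields a `deposit_id` comparable to `deposit.id`, this sketch returns every value as a string, so a caller would convert `deposit_id` with `int()` if needed.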